Considerations on the Evaluation of Biometric Quality Assessment Algorithms

Authors: Torsten Schlett, Christian Rathgeb, Juan Tapia, Christoph Busch

Quality assessment algorithms can be used to estimate the utility of a biometric sample for the purpose of biometric recognition. “Error versus Discard Characteristic” (EDC) plots, and “partial Area Under Curve” (pAUC) values of curves therein, are generally used by researchers to evaluate the predictive performance of such quality assessment algorithms. An EDC curve depends on an error type such as the “False Non Match Rate” (FNMR), a quality assessment algorithm, a biometric recognition system, a set of comparisons each corresponding to a biometric sample pair, and a comparison score threshold corresponding to a starting error. To compute an EDC curve, comparisons are progressively discarded based on the associated samples’ lowest quality scores, and the error is computed for the remaining comparisons. Additionally, a discard fraction limit or range must be selected to compute pAUC values, which can then be used to quantitatively rank quality assessment algorithms. This paper discusses and analyses various details for this kind of quality assessment algorithm evaluation, including general EDC properties, interpretability improvements for pAUC values based on a hard lower error limit and a soft upper error limit, the use of relative instead of discrete rankings, stepwise vs. linear curve interpolation, and normalisation of quality scores to a [0, 100] integer range. We also analyse the stability of quantitative quality assessment algorithm rankings based on pAUC values across varying pAUC discard fraction limits and starting errors, concluding that higher pAUC discard fraction limits should be preferred. The analyses are conducted both with synthetic data and with real data for a face image quality assessment scenario, with a focus on general modality-independent conclusions for EDC evaluations.
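The EDC and pAUC computation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `edc_curve` and `pauc_stepwise`, the toy data, and the convention that a mated comparison is a false non-match when its similarity score is below the threshold are all hypothetical choices made for this example. The quality of a comparison is taken as the lower of its two samples' quality scores, comparisons with tied quality are discarded together, and the curve is integrated stepwise up to the discard fraction limit.

```python
import numpy as np

def edc_curve(similarity, quality_a, quality_b, threshold):
    """Return (discard_fractions, fnmr) for a set of mated comparisons.

    Assumption: a comparison is a false non-match if similarity < threshold.
    """
    similarity = np.asarray(similarity, dtype=float)
    # Pairwise quality: the lower quality score of the two samples.
    pair_quality = np.minimum(quality_a, quality_b)
    order = np.argsort(pair_quality, kind="stable")  # lowest quality first
    sorted_quality = pair_quality[order]
    errors = (similarity[order] < threshold).astype(float)
    n = len(errors)
    # Errors left after discarding the k lowest-quality comparisons, k = 0..n-1.
    discarded_errors = np.concatenate(([0.0], np.cumsum(errors)[:-1]))
    fnmr = (errors.sum() - discarded_errors) / (n - np.arange(n))
    discard = np.arange(n) / n
    # Keep only points where the quality value changes, so that
    # comparisons with tied quality are discarded together.
    keep = np.concatenate(([True], sorted_quality[1:] != sorted_quality[:-1]))
    return discard[keep], fnmr[keep]

def pauc_stepwise(discard, fnmr, limit):
    """Area under the stepwise-interpolated EDC curve on [0, limit]."""
    edges = np.append(discard, 1.0)
    left = np.minimum(edges[:-1], limit)
    right = np.minimum(edges[1:], limit)
    return float(np.sum(fnmr * (right - left)))

# Toy data (hypothetical): four mated comparisons.
similarity = [0.2, 0.9, 0.8, 0.3]
quality_a = [10, 80, 90, 50]
quality_b = [20, 70, 95, 60]

discard, fnmr = edc_curve(similarity, quality_a, quality_b, threshold=0.5)
area = pauc_stepwise(discard, fnmr, limit=0.5)
print(discard, fnmr, area)
```

In this toy case the two erroneous comparisons also have the lowest pairwise quality, so the FNMR falls from its starting value of 0.5 to 0 once half of the comparisons are discarded. A lower pAUC value indicates that the quality scores are better at flagging error-prone comparisons.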

Paper link: http://arxiv.org/pdf/2303.13294v1

More computer science papers: http://cspaper.cn/
